The performance of a feedforward neural network (FNN) depends on the training algorithm and on
the architecture selected. Several parameters define the architecture of an FNN, such as the
number of connections between layers, the number of hidden neurons in each hidden layer, and the
number of hidden layers. The number of possible architectural combinations grows exponentially
and cannot be controlled manually, so a suitable architecture can be designed automatically by a
dedicated algorithm that builds a system with better generalization ability. The FNN architecture
can be determined with numerous optimization algorithms. This paper proposes a new methodology
for estimating the hidden layers of an FNN and their respective neurons. The work builds on the
advantages of the self-organizing feature map (SOFM) training algorithm and explains how the best
architecture is selected automatically by SOFM from a population of architectures, based on the
testing-error criterion. The proposed approach is tested on four benchmark classification
datasets of different sizes.


This is an open access article under the CC BY-SA license. 





Corresponding Author: 


Muthna Jasim Fadhil, 

Electrical Engineering Technical College, 
Middle Technical University (MTU), 
Al-Doora, Al-Masafee street, Baghdad, Iraq. 
Email: muthnafadhil@gmail.com








1. INTRODUCTION 

An artificial neural network (ANN) is a mathematical model designed to mimic the way the human
brain processes information, and it is divided into input, hidden, and output layers. In any
neural network the hidden layers act as the computation engine. The popularity of ANNs comes from
their capability to solve complex problems, their good learning ability, and their simple design
[1-3]. The number of neurons (processing units) in an ANN model is variable for the hidden
layers, while it is constant for the input and output layers. There is no straightforward
criterion for determining the number of neurons in the hidden layers, nor an established
supporting theory for calculating the number of hidden layers. These architectural choices affect
the performance of the ANN, because too few neurons and layers lead to underfitting, while a
massive network causes overfitting. Moreover, ANNs with different configurations give different
outputs for the same dataset, so the design of the ANN architecture is decisive and can be
treated as an optimization problem [4, 5].

In ANN architecture optimization, the population of architectures is known as the set of
solutions, and the cost function represents their testing error. The challenge is thus to obtain
the most favorable architecture, with minimum testing error, through several improvement
techniques. Generally, the choice of ANN architecture depends on a trial-and-error approach,
which is time consuming and poses many challenges, such as choosing the initial number of hidden
layers, hidden neurons, connections, etc. Prior knowledge of how ANNs function is required to
solve the problem at hand [6-9]. These initial parameters cannot be set without such knowledge,
because the number of attribute combinations grows exponentially. Selecting the parameters by
trial and error therefore demands intensive human effort, with no guarantee of obtaining the
right model. In addition, when the problem domain is complex, resolving the ANN parameters
becomes a relatively complex and tedious process. Optimization algorithms are widely used to
handle these issues, including the bat algorithm, simulated annealing (SA), and the genetic
algorithm (GA) [10].

Many algorithms that suggest optimal or near-optimal ANN structures, optimizing the ANN
architecture as well as the training rule, have been proposed over the last decades. Xia and
Kamel [11] applied a stochastic optimization technique, a pruning algorithm based on the
integration of simulated annealing (SA) and the genetic algorithm (GA). Jiaying et al. [12]
proposed an algorithm based on SA and tabu search (TS) to optimize the multilayer perceptron, in
which evaluating a batch of solutions in a single iteration avoids slow convergence. Janssens
[13] employed a multi-objective evolutionary process to optimize the ANN architecture, and two
real-world problems, car classification and face recognition, were processed by integrating TS
with SA. Al-Kazemi and Mohan [14] used a hybrid Taguchi-genetic algorithm to assist in FNN
parameter design. A particle swarm optimization (PSO) based approach was also proposed to design
the parameters and architecture of a three-layer FNN, and it was evaluated with discrete PSO and
an improved PSO version.

This paper proposes a new methodology that builds on the advantages of the SOFM training
algorithm to determine the best FNN design. It automatically handles the problem of defining the
hidden layers and their respective neurons, which was a manual task in earlier cases. The
methodology is applied to four different classification datasets: 1. the ISOLET dataset [15],
2. the MNIST handwritten digits dataset [14], 3. the gas sensor array drift dataset [5], and
4. a face recognition dataset [7]. The paper is organized as follows. Section 2 presents the
optimization methodology and its components: solution representation, fitness function,
population generation, and stopping mechanism. Section 3 describes the properties of the datasets
used in the study. Section 4 presents the experimental setup and results. Finally, section 5
concludes the paper with a discussion and the future scope.


2. OPTIMIZATION METHODOLOGY

SOFM with the proposed algorithm is effective across diverse problem domains and has received
broad consideration because of its adaptability and its considerable success in finding globally
optimal solutions. A neural network is formed by multiple connected layers, and each connection
between two neurons represents the strength (weight) through which activation passes in the FNN.
The input and output layers represent the start and stop points respectively. Between these two
layers lies the hidden layer, which minimizes the error between the desired and actual output by
adjusting the weights, as shown in Figure 1.


[Figure 1 diagram: input layer, hidden layer, output layer]

Figure 1. Neural network architecture


The neighbourhood search uses variations of short-term and long-term memory through heuristic and
local search methods, while the SOFM strategy tends to converge faster and minimizes
computational cost by evaluating a batch of solutions in a single iteration [16-18]. The
currently accepted solution is carried into the next iteration, and the solution with the lowest
cost at the final iteration represents the best solution. The strategy maintains a tabu list that
records the recently visited solutions so that repetition is avoided. The convolution layer comes
before the sub-sampling layer, and the number of planes is the same as in the convolution layer.
The sub-sampling layer reduces the feature map to the size desired by the next layer while
preserving the relative information between features, which relates directly to the performance
of the SOFM feature map. Figure 2 explains the operation of the sub-sampling layer.









[Figure 2 diagram: an N x N block of the input feature map (layer M-1) is summed, multiplied by a
weight, and added to a bias b to give the output O in the output feature map (one plane of
sub-sampling layer M)]

Figure 2. Operation of the sub-sampling layer
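The sub-sampling step labelled in Figure 2 (block sum, weight, bias b, output O) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `subsample`, the use of non-overlapping N x N blocks, and the omission of any activation function are assumptions, since the text does not specify them.

```python
from typing import List


def subsample(feature_map: List[List[float]], n: int,
              weight: float, bias: float) -> List[List[float]]:
    """Reduce a feature map by summing each non-overlapping n x n block,
    then applying one shared weight and bias (no activation, by assumption)."""
    rows, cols = len(feature_map), len(feature_map[0])
    if rows % n or cols % n:
        raise ValueError("feature map size must be a multiple of the block size")
    output = []
    for r in range(0, rows, n):
        out_row = []
        for c in range(0, cols, n):
            # Block sum over the n x n window starting at (r, c).
            block_sum = sum(feature_map[r + i][c + j]
                            for i in range(n) for j in range(n))
            out_row.append(weight * block_sum + bias)
        output.append(out_row)
    return output


# A 4 x 4 input map reduced to 2 x 2 with 2 x 2 blocks.
example = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 10, 11, 12],
           [13, 14, 15, 16]]
reduced = subsample(example, n=2, weight=0.25, bias=0.0)
```

With a weight of 0.25 and zero bias, each output value is simply the mean of its 2 x 2 block, which shows how the layer shrinks the map while keeping the relative information between neighbouring features.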


In this paper, SOFM is integrated into the search to reach the optimization aim of finding a
solution r in the solution set R such that f(r) ≤ f(r') for all r' ∈ R. The methodology starts
with an FNN having one hidden layer whose hidden neurons are selected randomly. The flow chart of
the proposed methodology is shown in Figure 3; the procedure is repeated up to the maximum number
of hidden layers, max. To obtain a solution, Iter iterations are run. Each iteration generates a
population of size Pmax (written Bmax in Figure 5 and section 4), and the best candidate is
chosen by SOFM selection based on the fitness function. If the best candidate of the iteration,
r', has a lower cost than the current best r, the best solution is updated and the next iteration
runs. Otherwise list_Tabu is updated with r', and the search explores further as long as the
stopping criteria are not met [18-20]. Figure 4 presents the pseudo code of the proposed
methodology (Algorithm 1). The implementation of SOFM with the proposed strategy considers the
following:

Solution representation.
Fitness function.
Population generation.
Stopping mechanism.




2.1. Solution representation

The evaluation considers FNNs with various numbers of hidden layers. Each network contains D
input nodes, Kj hidden nodes in the j-th hidden layer, and Q output nodes. The numbers of input
and output nodes are usually specific to the problem, whereas the optimal number of hidden layers
and their respective nodes must be found. The FNN architecture is formed as [21-23]:

$O = (D \times K_1 + P \times K_1) + (K_1 \times K_2 + P \times K_2) + \dots + (K_{max} \times Q + P \times Q)$ (1)


Each solution consists of three variables: KL, the number of hidden layers; KM, a vector holding
the number of hidden neurons in each layer; and ET, the testing error of the given solution.
P represents the bias.

$r = (K_L, K_M, E_T)$ (2)

$K_M = (K_1, K_2, K_3, \dots, K_{max}), \quad K_F \in M, \quad \forall\, K_{(F-1)} > K_F > K_{(F+1)}$ (3)

and F = 1, 2, ..., max.

$K_L = \{1, 2, 3, \dots, max\}$ and $E_T \in S$ (4)
where S is the set of real numbers and M is the set of natural numbers. The initial solution is a
fully connected FNN with a single hidden layer and fixed numbers of input and output neurons. The
number of hidden neurons is selected randomly from the range [(D+Q)/2, (D+Q)×2/3], and the
initial weights are uniformly distributed in the interval [-1, +1] [24].
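As a worked illustration of equation (1) and of the initial range for the hidden neurons, the following Python sketch counts the terms of equation (1) for a given architecture and draws an initial hidden layer size. It reads equation (1) as the total count of connection weights plus bias terms, which is an interpretation, since the text does not name the quantity O. The function names and the choice P = 1 (one bias per node) are assumptions made for the example.

```python
import random
from typing import List


def count_weights(d: int, hidden: List[int], q: int, p: int = 1) -> int:
    """Evaluate equation (1): for every pair of adjacent layers add
    (previous size x current size) plus (bias P x current size)."""
    sizes = [d] + list(hidden) + [q]
    return sum(prev * cur + p * cur for prev, cur in zip(sizes, sizes[1:]))


def initial_hidden_neurons(d: int, q: int, rng: random.Random) -> int:
    """Draw the hidden layer size from [(D+Q)/2, (D+Q)*2/3], rounded down."""
    low = (d + q) // 2
    high = (d + q) * 2 // 3
    return rng.randint(low, high)  # randint includes both end points


# Example with D = 784 inputs and Q = 10 outputs.
rng = random.Random(0)
size = initial_hidden_neurons(784, 10, rng)   # lies in [397, 529]
total = count_weights(784, [517], 10)         # single hidden layer of 517 nodes
```

For D = 784 and Q = 10 the range is [397, 529], so a single-hidden-layer result such as KM = 517 falls inside the interval from which the search starts.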
[Figure 3 flow chart: initialize the hidden layer count KL = 1 (F = 1, ..., max), Optimal, Iter
and list_M; calculate the hidden neurons of the F-th layer and update list_M; form the initial
solution with KL hidden layers, determine its fitness and update the best solution; generate the
population and determine the cost of each member; update the best solution until Iteration =
Iter; then update the optimal solution]

Figure 3. Flow chart of the proposed algorithm





2.2. Fitness function

The fitness function gives the approximation capability of the model in terms of a percentage.
The objective function is minimized, which requires comparing the performance of the selected
solution with that of the solutions from successive iterations. Let the dataset be divided into
GM classes and let E be the testing set; the actual class of a sample h can be represented by
[25-27]:

$y(h) \in \{1, 2, 3, \dots, G_M\} \quad \forall h \in E$ (5)

The proposed technique uses the winner-takes-all method, so the given dataset has a one-to-one
correspondence between the GM classes and the number of output neurons. If $Q_B(h)$ is the value
of output node B for sample h, the class assigned to sample h is:

$\hat{o}(h) = \arg\max_{B \in \{1, 2, \dots, G_M\}} Q_B(h) \quad \forall h \in E$ (6)

and the error of sample h is:

$\varepsilon(h) = \begin{cases} 1, & \text{if } y(h) \neq \hat{o}(h) \\ 0, & \text{otherwise} \end{cases}$ (7)

The classification error of the network, expressed as the percentage of samples of the testing
set E that are misclassified during the testing phase, is therefore:

$E_T = \frac{100}{|E|} \sum_{h \in E} \varepsilon(h)$ (8)

where |E| is the cardinality of the testing set.
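Equations (5) to (8) amount to a winner-takes-all prediction followed by a misclassification percentage. The short Python sketch below illustrates that calculation; the function name `classification_error` and the list-based data layout are assumptions made for the example, and ties between output nodes are resolved in favour of the lowest class index, a detail the text does not address.

```python
from typing import List, Sequence


def classification_error(outputs: Sequence[Sequence[float]],
                         labels: Sequence[int]) -> float:
    """Percentage of misclassified test samples.

    outputs[h][B-1] is the value of output node B for sample h, and
    labels[h] is the actual class y(h) in 1..GM.
    """
    if not labels or len(outputs) != len(labels):
        raise ValueError("outputs and labels must be non-empty and equally long")
    wrong = 0
    for row, actual in zip(outputs, labels):
        # Winner-takes-all: the class of the largest output node (1-based).
        predicted = max(range(len(row)), key=row.__getitem__) + 1
        wrong += int(predicted != actual)   # epsilon(h) from equation (7)
    return 100.0 * wrong / len(labels)      # equation (8)


# Four test samples, two classes; the third sample is misclassified.
outputs = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
labels = [2, 1, 1, 1]
error = classification_error(outputs, labels)
```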
Algorithm 1: Pseudo code of the proposed methodology
INPUT: Iter, max, input neurons, output neurons
For KL = 1 to max
{
    input neurons → Input
    output neurons → Output
    null → list_M
    null → list_Tabu
    For F = 1 to KL
    {
        (Output + Input)/2 → o
        (Output + Input) × 2/3 → p
        Random(o, p) → list_M[F]
        list_M[F] → Input
    }
    Calculate_fitness(KL, list_M) → r0
    r0 → rbest                         // the initial solution is the first best solution
    rbest → list_Tabu
    For Z = 1 to Iter
    {
        Generate_Neighbor(r(Z-1)) → r'     // r' is the best neighbor of r(Z-1)
        If f(rbest) > f(r')
        {
            r' → rbest
            r' → rZ
        }
        Else
        {
            r' → list_Tabu[next]
            r' → rZ
        }
    }
    rbest → Optimal[KL]                // list of optimal architectures
}
Return best of Optimal[KL]

Figure 4. Pseudo code of the proposed methodology (Algorithm 1)
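The control flow of Algorithm 1 can be made concrete with a small runnable Python sketch. It is an illustration under stated assumptions, not the authors' code. `toy_fitness` and `toy_neighbor` are hypothetical stand-ins: a real fitness function would train the FNN and return its testing error, and the real neighbor generator is the one described in Figure 5. As in the pseudo code, the tabu list here only records rejected candidates.

```python
import random
from typing import Callable, List, Tuple

Solution = Tuple[List[int], float]  # (hidden layer sizes KM, testing error ET)


def initial_layers(n_in: int, n_out: int, k_l: int, rng: random.Random) -> List[int]:
    """Draw each hidden layer size from [(in+out)/2, (in+out)*2/3]; the drawn
    size becomes the input size for the next layer, as in Algorithm 1."""
    layers, prev = [], n_in
    for _ in range(k_l):
        low, high = (prev + n_out) // 2, (prev + n_out) * 2 // 3
        size = max(1, rng.randint(low, high))
        layers.append(size)
        prev = size
    return layers


def search(n_in: int, n_out: int, max_layers: int, iters: int,
           fitness: Callable[[List[int]], float],
           neighbor: Callable[[List[int], random.Random], List[int]],
           seed: int = 0) -> Solution:
    rng = random.Random(seed)
    optimal: List[Solution] = []
    for k_l in range(1, max_layers + 1):
        layers = initial_layers(n_in, n_out, k_l, rng)
        current: Solution = (layers, fitness(layers))
        best = current
        tabu = [best[0]]                      # visited architectures
        for _ in range(iters):
            cand = neighbor(current[0], rng)
            candidate: Solution = (cand, fitness(cand))
            if candidate[1] < best[1]:        # lower cost: new best solution
                best = candidate
            else:
                tabu.append(cand)             # otherwise record it as visited
            current = candidate               # the search moves on either way
        optimal.append(best)
    return min(optimal, key=lambda s: s[1])   # best over all layer counts


def toy_fitness(layers: List[int]) -> float:
    # HYPOTHETICAL cost used only to make the sketch runnable.
    return abs(sum(layers) - 300) / 10 + len(layers)


def toy_neighbor(layers: List[int], rng: random.Random) -> List[int]:
    # HYPOTHETICAL move: change every layer by about 3% in a random direction.
    return [max(1, k + rng.choice((-1, 1)) * max(1, round(0.03 * k))) for k in layers]


best_layers, best_error = search(784, 10, max_layers=3, iters=10,
                                 fitness=toy_fitness, neighbor=toy_neighbor)
```

Passing the fitness and neighbor functions as arguments keeps the outer loop independent of how a network is trained, so the same skeleton could be reused with a real training routine.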


2.3. Population generation

The proposed methodology evaluates the initial solution r = (KL, KM, ET) and then uses the SOFM
algorithm to generate a complete population of new solutions until the stopping criteria are met.
The population is divided into two equal parts: in one part the number of neurons of a particular
layer is increased, and in the other it is decreased, as shown in Figure 5. Case 1: for every
layer, a random number x uniformly distributed in [0, 1] is generated [17, 28]:

$K_M = \begin{cases} w^{+}, & x > p \\ \text{no change}, & x \le p \end{cases}$ (9)

where $w^{+}$ signifies that the number of neurons is increased by w percent; the increase
continues as long as the rate does not go beyond the upper boundary. Case 2: a random number x is
generated:

$K_M = \begin{cases} w^{-}, & x > p \\ \text{no change}, & x \le p \end{cases}$ (10)

where $w^{-}$ signifies that the number of neurons is decreased by w percent until the lower
boundary is reached. This process of generating new solutions makes it possible to move both
forward and backward in the search for the globally optimal solution.


2.4. Stopping mechanism

The SOFM algorithm stops when Iter iterations have been run for the maximum number of hidden
layers being optimized. In addition, the neuron update is skipped and the search jumps to the
next step in two cases: first, when the algorithm is increasing the number of neurons and the
upper limit has already been reached; second, when the algorithm is decreasing the number of
neurons and the lower limit has already been reached.
Algorithm 2: Pseudo code for Generate_Neighbor
INPUT: Bmax, p, w
NULL → list_candidate[Bmax]
For F = 1 to (Bmax/2)
{
    Null → ET
    For y = 1 to KL
    {
        X = random(0, 1)
        If X > p                       // p is the probability
            the number of neurons in this layer is increased by w
        Otherwise
            the number of neurons in this layer is not changed
    }
    update list_M for list_candidate[F]
    Calculate_fitness(list_candidate[F]) → ET
    update ET of list_candidate[F]
}
For F = 1 to (Bmax/2)
{
    Null → ET
    For y = 1 to KL
    {
        X = random(0, 1)
        If X > p                       // p is the probability
            the number of neurons in this layer is decreased by w
        Otherwise
            the number of neurons in this layer is not changed
    }
    update list_M for list_candidate[F + Bmax/2]
    Calculate_fitness(list_candidate[F + Bmax/2]) → ET
    update ET of list_candidate[F + Bmax/2]
}
Return best of list_candidate[Bmax]

Figure 5. Algorithm used to generate the population of solutions in the proposed methodology
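The neighbor generation of Figure 5 and equations (9) and (10) can be sketched in Python as follows. This is a hedged illustration, not the authors' code: the function name, the `lower` and `upper` bounds, and the rule that a change is at least one neuron are assumptions added so that the example runs. The fitness evaluation of each candidate is left out to keep the sketch focused on how the population is built.

```python
import random
from typing import List


def generate_neighbors(layers: List[int], b_max: int, p: float, w: float,
                       lower: int, upper: int,
                       rng: random.Random) -> List[List[int]]:
    """Build b_max candidate architectures from `layers`.

    The first half may only increase layer sizes (case 1), the second half
    may only decrease them (case 2). A layer changes when x > p.
    """
    candidates = []
    half = b_max // 2
    for index in range(b_max):
        sign = 1 if index < half else -1
        candidate = []
        for k in layers:
            x = rng.random()                      # x uniform in [0, 1)
            if x > p:
                step = max(1, round(k * w))       # w percent, at least one neuron
                k = min(upper, max(lower, k + sign * step))  # stay within bounds
            candidate.append(k)
        candidates.append(candidate)
    return candidates


# Bmax = 20, probability 0.5 and w = 3%, the settings reported in section 4.
population = generate_neighbors([100, 200], b_max=20, p=0.5, w=0.03,
                                lower=1, upper=1000, rng=random.Random(1))
```

Clamping to `lower` and `upper` mirrors the boundary checks of the stopping mechanism: once a layer reaches a limit, further moves in that direction leave it unchanged.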


3. DATASETS 

The proposed algorithm is validated on four classification datasets with diverse features, as
shown in Table 1. Datasets with a huge number of features are required because the algorithm
would be inefficient on datasets with few features, where the search converges within a single
hidden layer.


3.1. Face recognition dataset

This dataset contains high-resolution images of males and females aged between 17 and 45 years
showing different emotions. The images are taken from the Chicago Face Database (CFD), developed
at the University of Chicago, and each image is in JPEG format. There are 878 female and 974 male
images, 1852 images in total. The images were converted from JPEG format into vector format. Each
vector has dimension 1x785: columns 1 to 784 describe the image features, and the last column
holds the label.


3.2. Gas sensor array drift dataset

This dataset holds 13920 examples from sixteen chemical sensors exposed to six gases at different
concentration levels, and it is employed in drift compensation simulations for a classification
task. The samples were collected on a fully computerized gas delivery platform, which minimizes
the common errors caused by human involvement. The dataset has 130 inputs and six outputs, and
the purpose is to accomplish good performance over time.


3.3. MNIST handwritten digits dataset

MNIST (Modified National Institute of Standards and Technology) is a large database of
handwritten digits that is commonly used for training image processing systems. The database was
generated from samples of the NIST datasets. Each image is normalized to fit a 28x28 pixel box
and formatted as bi-level. The MNIST dataset consists of 70000 samples with 784 inputs and 10
outputs, and each image is classified into one of 10 distinct classes in the range 0-9.


3.4. ISOLET dataset

In the ISOLET (isolated letter speech recognition) dataset, each letter of the alphabet is spoken
twice by 150 speakers. The authors of the dataset distributed the speakers into five groups of
thirty speakers each; four groups are selected for training and one group is used for testing.
The dataset has 7797 samples in total, each recorded with 617 features and classified into one of
26 distinct classes.


4. RESULTS AND EXPERIMENTAL SETUP 

In the implementation of SOFM with the proposed algorithm, each randomly selected network
architecture is validated on a validation set comprising 20% of the dataset. The implementation
is based on activation functions with dropout, using an input dropout ratio of about 0.2, and the
datasets are normalized by the min-max method. The effectiveness of the proposed methodology is
tested on the four classification datasets listed in Table 1. The main contribution of the paper
is to obtain the optimal architecture of a deep FNN when the datasets have a huge number of
features and need more than one hidden layer.
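The preprocessing described above, min-max normalization followed by holding out 20% of the data for validation, can be sketched in plain Python. The function names, the random shuffling before the split, and the handling of constant columns (mapped to 0.0) are assumptions made for the example; the text does not describe these details.

```python
import random
from typing import List, Tuple

Rows = List[List[float]]


def min_max_normalize(rows: Rows) -> Rows:
    """Scale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in rows]


def split_validation(rows: Rows, fraction: float = 0.2,
                     seed: int = 0) -> Tuple[Rows, Rows]:
    """Shuffle the rows and hold out `fraction` of them for validation."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_val = int(round(len(rows) * fraction))
    validation = [rows[i] for i in indices[:n_val]]
    training = [rows[i] for i in indices[n_val:]]
    return training, validation


normalized = min_max_normalize([[1, 10], [3, 10], [5, 10]])
training, validation = split_validation([[float(i)] for i in range(10)])
```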


Table 1. Statistics of the datasets

Dataset             Examples   Features   Classes   Reference
ISOLET              7797       617        26        [15]
MNIST               70000      784        10        [14]
Drift-Gas           13920      130        6         [5]
Face Recognition    1852       784        2         [7]





In this experiment, all datasets except the face data have multiple classes, and the ISOLET,
MNIST, and face recognition datasets have more than 600 features. The proposed methodology begins
with one hidden layer whose neurons are selected randomly. The fitness function is then
calculated, and this initial solution is taken as the best solution from which the search for the
globally optimal solution starts. Each selected solution is iterated Iter = 10 times, with
Bmax = 20 in every iteration. Bmax is divided into two parts: in the first part the number of
neurons in the selected layer is increased, and in the second part it is decreased. The neuron
update rate w is set to 3%, and the probability of changing the neurons of a particular layer is
set to 0.5. The methodology runs up to a maximum of Kmax = 5 layers, which can be increased if
needed. The training and testing errors of the proposed SOFM algorithm for KL = {1, 2, 3, 4, 5}
are shown in Figure 6 (a) and (b).
Figure 6. Performance of the proposed algorithm for various numbers of hidden layers:
(a) Training error, (b) Testing error
Table 2 shows the results of the proposed methodology. According to the proposed approach, the
optimized architecture for the face recognition dataset was KL=2 and KM=(437,260) with 10.34%
testing classification error. The optimal topology for the drift gas dataset was KL=2 and
KM=(91,63) with 5.073% testing classification error. The best topology for the MNIST dataset was
KL=1 and KM=517 with 1.745% testing classification error, while the optimal topology for the
ISOLET dataset requires KL=2 and KM=(363,232) with 1.765% testing classification error.


Table 2. Experimental results obtained with the proposed method

Dataset            Testing error (%)   Training error (%)   Hidden neurons          Hidden layers
ISOLET             2.0698              0.0301               335                     1
                   1.7658              0.3378               363,232                 2
                   2.3935              0.8519               397,212,119             3
                   4.6555              3.3127               406,254,155,100         4
                   91.484              92.448               402,227,135,78,46       5
MNIST              1.745               0.521                517                     1
                   1.832               0.545                548,224                 2
                   1.917               1.1127               521,289,165             3
                   2.155               1.2714               532,302,130,75          4
                   3.053               3.3475               490,274,158,103,64      5
Drift_Gas          6.5471              6.3171               72                      1
                   5.07398             4.6761               91,63                   2
                   7.2962              6.1466               79,52,37                3
                   11.7081             11.5086              75,78,34,23             4
                   23.8846             24.4556              76,42,36,22,18          5
Face recognition   11.133              2.5125               515                     1
                   10.3421             8.1481               437,260                 2
                   11.6585             9.4384               506,261,140             3
                   11.3072             8.5615               454,261,148,73          4
                   13.271              11.7218              445,250,156,189,48      5
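The optimal topologies reported in the text follow from Table 2 by taking, for each dataset, the number of hidden layers with the lowest testing error. The short Python check below reproduces that selection; the testing errors are copied from Table 2, while the dictionary layout and names are only an illustration.

```python
# Testing error (%) per number of hidden layers, copied from Table 2.
testing_error = {
    "ISOLET": [(1, 2.0698), (2, 1.7658), (3, 2.3935), (4, 4.6555), (5, 91.484)],
    "MNIST": [(1, 1.745), (2, 1.832), (3, 1.917), (4, 2.155), (5, 3.053)],
    "Drift_Gas": [(1, 6.5471), (2, 5.07398), (3, 7.2962), (4, 11.7081), (5, 23.8846)],
    "Face recognition": [(1, 11.133), (2, 10.3421), (3, 11.6585), (4, 11.3072), (5, 13.271)],
}

# For every dataset keep the (hidden layers, testing error) pair with minimum error.
best = {name: min(rows, key=lambda row: row[1])
        for name, rows in testing_error.items()}

for name, (layers, error) in best.items():
    print(f"{name}: KL={layers}, testing error={error}%")
```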





5. CONCLUSIONS 

This work shows that SOFM with the proposed algorithm was applied successfully to optimize deep
FNN architectures and obtained optimal solutions. Each solution, treated as an entity in the
search space, is represented by its hidden layers, hidden neurons, and testing error. The aim of
the methodology is to find the globally optimal FNN solution, with good performance and the
lowest testing error. Datasets with a huge number of samples and attributes were used in the
experiments, and the generated results show the deep FNN architectures suggested by the proposed
methodology. An interesting finding is that, except for the MNIST dataset, the FNN architecture
that gave the highest accuracy for every dataset required two hidden layers. The proposed
methodology can therefore be applied in cases where the FNN requires more than one hidden layer,
that is, where deep neural networks are used. In future work, the approach can be extended to
find the optimal connections within the same solution, and a hybrid approach can be developed by
combining the proposed methodology with optimization techniques such as PSO, GA, and SA.